Readability Annotation: Replacing the Expert by the Crowd
Abstract
This paper investigates two strategies for collecting readability assessments: an Expert Readers application, intended to collect fine-grained readability assessments from language experts, and a Sort by Readability application, designed to be intuitive and open to anyone with internet access. We show that the data sets resulting from the two annotation strategies are very similar, and we conclude that crowdsourcing is a viable alternative to the opinions of language experts for readability prediction.
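Work of this kind typically quantifies how well two sets of assessments agree by correlating the rankings they induce over the same texts. The abstract does not state which statistic was used; a rank correlation such as Spearman's rho is a common choice, and the sketch below, using invented scores, only illustrates the idea.

```python
# Minimal sketch: comparing expert and crowd readability assessments of the
# same texts with Spearman rank correlation. The scores below are invented
# for illustration; the paper's actual data and evaluation code are not shown.
from scipy.stats import spearmanr

expert_scores = [0.9, 0.7, 0.4, 0.8, 0.2]    # fine-grained expert assessments
crowd_scores = [0.85, 0.65, 0.5, 0.75, 0.1]  # aggregated crowd assessments

rho, p_value = spearmanr(expert_scores, crowd_scores)
print(f"Spearman rho = {rho:.3f} (p = {p_value:.3f})")
# A rho close to 1 means the crowd orders the texts much like the experts do.
```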
Similar Papers
Focus Annotation of Task-based Data: Establishing the Quality of Crowd Annotation
We explore the annotation of information structure in German and compare the quality of expert annotation with crowdsourced annotation, taking into account the cost of reaching crowd consensus. Concretely, we discuss a crowdsourcing effort annotating focus in a task-based corpus of German containing reading comprehension questions and answers. Against the backdrop of a gold standard reference r...
A Methodology for Corpus Annotation through Crowdsourcing
In contrast to expert-based annotation, for which elaborate methodologies ensure high-quality output, no systematic guidelines currently exist for crowdsourcing annotated corpora, despite the increasing popularity of this approach. To address this gap, we define a crowd-based annotation methodology, compare it against the OntoNotes methodology for expert-based annotation, and identify future ch...
Using the crowd for readability prediction
Inspired by previous work on crowdsourcing, we investigate two different methodologies for assessing the readability of a wide variety of text material by implementing two assessment tools: a lightweight crowdsourcing tool that invites users to provide pairwise comparisons, and a more advanced version in which experts can rank a batch of texts by readability. In order to validate this approach, r...
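This abstract names pairwise comparison as the crowd task but not how the comparisons are aggregated into per-text readability assessments. One standard option is a Bradley-Terry model; the sketch below fits one with a simple minorization-maximization loop over hypothetical judgments (all identifiers and data are invented for illustration).

```python
# Sketch: aggregating crowdsourced pairwise readability judgments into a
# ranking via a Bradley-Terry model (one plausible aggregation; the paper
# does not specify its method). Each tuple (a, b) means a worker judged
# text `a` more readable than text `b`.
from collections import defaultdict

comparisons = [("t1", "t2"), ("t1", "t3"), ("t2", "t3"), ("t3", "t2"), ("t1", "t2")]

texts = sorted({t for pair in comparisons for t in pair})
wins = defaultdict(int)         # wins[t]: how often t was judged more readable
pair_counts = defaultdict(int)  # pair_counts[{a,b}]: total comparisons of a vs b
for a, b in comparisons:
    wins[a] += 1
    pair_counts[frozenset((a, b))] += 1

# Minorization-maximization updates for the Bradley-Terry strengths.
strength = {t: 1.0 for t in texts}
for _ in range(100):
    new = {}
    for t in texts:
        denom = sum(
            pair_counts[frozenset((t, u))] / (strength[t] + strength[u])
            for u in texts
            if u != t
        )
        new[t] = wins[t] / denom if denom else strength[t]
    total = sum(new.values())
    strength = {t: s / total for t, s in new.items()}  # normalize for stability

ranking = sorted(texts, key=strength.get, reverse=True)
print("Most to least readable:", ranking)
```

With the toy data above, "t1" wins every comparison it appears in and comes out on top; ties in noisy crowd data are resolved by the relative strengths the model estimates.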
Text Readability within Video Retrieval Applications: A Study On CCTV Analysis
The indexing and retrieval of video footage requires appropriate annotation of the video so that search queries can return useful results. This paper discusses an approach to automating video annotation based on an expanded notion of readability that covers both text factors and cognitive factors. The eventual aim is the selection of ontological elements that support wider ranges ...
Focus Annotation of Task-based Data: A Comparison of Expert and Crowd-Sourced Annotation in a Reading Comprehension Corpus
While the formal pragmatic concepts in information structure, such as the focus of an utterance, are precisely defined in theoretical linguistics and potentially very useful in conceptual and practical terms, it has turned out to be difficult to reliably annotate such notions in corpus data (Ritz et al., 2008; Calhoun et al., 2010). We present a large-scale focus annotation effort designed to o...